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Abstract 

We present a fast, adaptive multiresolution algorithm for applying integral operators 
| 7 | . with a wide class of radially symmetric kernels in dimensions one, two and three. 

f-n* \ This algorithm is made efficient by the use of separated representations of the kernel. 

We discuss operators of the class (—A + fi 2 I)~ a , where \i > and < a < 3/2, 
and illustrate the algorithm for the Poisson and Schrodinger equations in dimension 
three. The same algorithm may be used for all operators with radially symmetric 
kernels approximated as a weighted sum of Gaussians, making it applicable across 
multiple fields by reusing a single implementation. 

^"T* ■ This fast algorithm provides controllable accuracy at a reasonable cost, compa- 

! rable to that of the Fast Multipole Method (FMM). It differs from the FMM by 

the type of approximation used to represent kernels and has an advantage of being 

\Q • easily extendable to higher dimensions. 

O " 

o 
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1 Introduction 



For a number of years, the Fast Multipole Method (FMM) [1,2,3] has been 
the method of choice for applying integral operators to functions in dimen- 
sions d < 3. On the other hand, multiresolution algorithms in wavelet and 
multiwavelet bases introduced in [4] for the same purpose were not efficient 
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enough to be practical in more than one dimension. Recently, with the in- 
troduction of separated representations [5,6,7], practical multiresolution algo- 
rithms in higher dimensions [8,9,10,11] became available as well. In this paper 
we present a new fast, adaptive algorithm for applying a class of integral op- 
erators with radial kernels in dimensions d = 1,2, 3, and we briefly discuss its 
extension to higher dimensions. 

In physics, chemistry and other applied fields, many important problems may 
be formulated using integral equations, typically involving Green's functions 
as their kernels. Often such formulations are preferable to those via partial 
differential equations (PDEs). For example, evaluating the integral expressing 
the solution of the Poisson equation in free space (the convolution of the 
Green's function with the mass or charge density) avoids issues associated with 
the high condition number of a PDE formulation. Integral operators appear 
in fields as diverse as electrostatics, quantum chemistry, fluid dynamics and 
geodesy; in all such applications fast and accurate methods for evaluating 
operators on functions are needed. 

The FMM and our approach both employ approximate representations of 
operators to yield fast algorithms. The main difference lies in the type of 
approximations that are used. For example, for the Poisson kernel 1/r in 
dimension d = 3, the FMM [3] uses a plane wave approximation starting from 
the integral 



where r = J (x — xo) 2 + (y — yo) 2 + [z — z ) 2 . The elegant approximation de- 
rived from this integral in [3] is valid within a solid angle, and thus requires 
splitting the application of an operator into directional regions; the number 
of such regions grows exponentially with dimension. For the same kernel, our 
approach starts with the integral 



and its discretization with finite accuracy e yields a spherically symmetric 
approximation as a weighted sum of gaussians. Other radial kernels can be 
similarly treated by a suitable choice of integrals. The result is a separated 
representation of kernels and, therefore, an immediate reduction in the cost 
of their application. This difference in the choice of approximation dictates 
the differences in the corresponding algorithms. In dimension d < 3 both ap- 
proaches are practical and yield comparable performance. The key advantage 
of our approach is its straightforward extensibility to higher dimensions [6,7]. 

Given an arbitrary accuracy e, we effectively represent kernels by a set of ex- 
ponents and weights describing the terms of the gaussian approximation of 
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integrals like in (2). The number of terms in such sum is roughly proportional 
to log(e _1 ), or a low power of log(e _1 ), depending on the operator. Since op- 
erators are fully described up to an accuracy e by the exponents and weights 
of the sum of gaussians, a single algorithm applies all such operators. These 
include operators such as (—A + n 2 I)~ a , where ji > and < a < 3/2, 
and certain singular operators such as the projector on divergence-free func- 
tions. Since many physically significant operators depend only on the distance 
between interacting objects, our approach is directly applicable to problems 
involving a wide class of operators with radial kernels. 

We combine separated and multiresolution representations of kernels and use 
multiwavelet bases [12] that provide inter alia a method for discretizing inte- 
gral equations, as is the case in quantum chemistry [8,9,11,10]. This choice of 
multiresolution bases accommodates integral and differential operators as well 
as a wide variety of boundary conditions, without degrading the order of the 
method [13,14]. Multiwavelet bases retain the key desired properties of wavelet 
bases, such as vanishing moments, orthogonality, and compact support. Due to 
the vanishing moments, wide classes of integro-differential operators have an 
effectively sparse matrix representation, i.e., they differ from a sparse matrix 
by an operator with small norm. Some of the basis functions of multiwavelet 
bases are discontinuous, similar to those of the Haar basis and in contrast to 
wavelets with regularity (see e.g. [15,16]). The usual choices of scaling func- 
tions for multiwavelet bases are either the scaled Legendre or interpolating 
polynomials. Since these are also used in the discontinuous Galerkin and dis- 
continuous spectral elements methods, our approach may also be seen as an 
adaptive extension of these methods. 

The algorithm for applying an operator to a function starts with computing 
its adaptive representation in a multiwavelet basis, resulting in a 2 d -tree with 
blocks of coefficients at the leaves. Then the algorithm adaptively applies the 
(modified) separated non-standard form [4] of the operator to the function by 
using only the necessary blocks as dictated by the function's tree represen- 
tation. We note that in higher dimensions, d ^> 3, functions need to be in 
a separated representation as well, since the usual constructions via bases or 
grids are prohibitive (see [6,7]). 

We start in Section 2 by recalling the basic notions of multiresolution analy- 
sis, non-standard operator form and adaptive representation of functions un- 
derlying our development. We then consider the separated representation for 
radially symmetric kernels in Section 3, and use it to efficiently extend the 
modified ns-form to multiple spatial dimensions in Section 4. We pay partic- 
ular attention to computing the band structure of the operator based on one 
dimensional information. We use this construction in Section 5 to introduce 
the adaptive algorithm for application of multidimensional operators in the 
modified ns-form, and illustrate its performance in Section 6. We consider 
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two examples: the Poisson equation in free space and the ground state of the 
Hydrogen atom. We conclude with a brief discussion in Section 7. 



2 Preliminary considerations 

This section and Appendix are provided for the convenience of the reader in 
order to keep this paper reasonably self-contained. We provide background 
material and introduce necessary notation. 

The essence of our approach is to decompose the operator using projectors 
on a Multiresolution Analysis (MRA), and to efficiently apply its projections 
using a separated representation. We use the decomposition of the operator 
into the ns-form [4], but we organize it differently (thus, modified ns-form) to 
achieve greater efficiency. This modification becomes important as we extend 
this algorithm to higher dimensions. 

In this section we introduce notation for MRA, describe the adaptive repre- 
sentation of functions and associated data structures, introduce the modified 
ns-form and an algorithm for its adaptive application in dimension d = 1 as 
background material for the multidimensional case. 

2.1 Multiresolution analysis 

Let us consider the multiresolution analysis as a decomposition of L 2 ([0, l] d ) 
into a chain of subspaces 

v C V a C V 2 C ■ ■ ■ c v n c . . . , 

so that L 2 ([0, l] d ) = WjL Vj. We note that our indexing of subspaces (increas- 
ing towards finer scales) follows that in [13], and is the reverse of that in [4,15]. 
On each subspace Vj, we use the tensor product basis of scaling functions ob- 
tained using the functions <fi J kl (x) (k — 0, . . . ,p — 1) which we briefly describe 
in Appendix. 

The wavelet subspaces Wj are defined as the orthogonal complements of Vj 
in Vj_i, thus 

V n = V ®] =0 Wj . 

Introducing the orthogonal projector on Vj, Pj : L 2 ([0, l] d ) — > Vj and con- 
sidering an operator T : £ 2 ([0, l] d ) —> L 2 ([0, l] d ), we define its projection 
Tj : Vj — > Vj as Tj = PjTPj. We also consider the orthogonal projector 
Qj : L 2 ([0, l] d ) -> Wj, defined as Qj = P i+1 - Pj. 
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2.2 Adaptive representation of functions 



Let us describe an adaptive refinement strategy for construction multiresolu- 
tion representations of functions / : B —> B, where B = [0, l] d . We proceed 
by recursive binary subdivision of the box B, so the basic structure represent- 
ing our functions is a 2 d — tree with arrays of coefficients stored at the leaves 
(terminal nodes) and no data stored on internal nodes. On each box obtained 
via this subdivision, our basis is a tensor product of orthogonal polynomials of 
degree k — 0, . . . ,p—l in each variable, as described in Appendix 8. Therefore, 
the leaves carry <i-dimensional arrays of p d coefficients which may be used to 
approximate function values anywhere in the box corresponding to the spatial 
region covered by it, via (30) or its equivalent for higher values of d. For con- 
ciseness, we will often refer to these d-dimensional arrays of coefficients stored 
at tree nodes as function blocks. 

This adaptive function decomposition algorithm is similar to that used in [17]. 
Such construction formally works in any dimension d. However, since its com- 
plexity scales exponentially with d, its practical use is restricted to fairly low 
dimension, e.g. d < 4. In higher dimensions, alternate representation strate- 
gies for functions such as [6,7] should be considered. In high dimensions, these 
strategies deal with the exponential growth of complexity by using controlled 
approximations that have linear cost in d. 

For simplicity, we will describe the procedure for the one-dimensional case 
since the extension to dimensions d = 2, 3 is straightforward. Since we can 
not afford to construct our representation by starting from a fine scale (es- 
pecially in d = 3), we proceed by successive refinements of an initial coarse 
sampling. This approach may result in a situation where the initial sampling 
is insufficient to resolve a rapid change in a small volume; however, in practical 
applications such situations are rare and may be avoided by an appropriate 
choice of the initial sampling scale. 

Let B{ = [2~H, + 1)], I = 0, . . . , 2 J ' — 1, represent a binary subinterval on 
scale j. We denote by // = {fi( x k)Y k the vector of values of the function 
/ on the Gaussian nodes in B\. From these values we compute the coeffi- 
cients ( see (32) in Appendix) and interpolate f(x) for any x G B\ 
by using (30). We then subdivide B{ into two child intervals, B^ 1 and B^, 
and evaluate the function / on the Gaussian nodes in B^ 1 and -B^i+i- We 
then interpolate / by using the coefficients ( S L}^_ from their parent inter- 
val and denote by and f^i+i the vectors of interpolated values on the 
two subintervals. Now, if for a given tolerance e either f^ 1 — f? + 



I 21 



> e or 



fii+i ~ fii+i > e 5 we re P ea t the process recursively for both subintervals, 
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£>^ +1 and i^'l+i; otherwise, we keep the coefficients {sjw} to represent the 

function on the entire interval B\. This interval then becomes a leaf in our 
tree. 



At this stage we use the £°° norm, thus constructing an approximation / to 



the original function / such that 



< e. 



which immediately implies 



that 



/-/ 

< e. This estimate clearly extends to any dimension. Once 
the approximation with £°° norm is constructed, the corresponding tree may 
be pruned if an application only requires the approximation to be valid in 
the £ 2 norm. We start this process on the finest scale and simply remove all 
blocks whose cumulative contribution is below e. Other norms, such as Hi, 
can be accommodated by appropriately weighing the error tolerance with a 
scale-dependent factor in the initial (coarse to fine) decomposition process. 



The complete decomposition algorithm proceeds by following the above recipe, 
starting with an initial coarse scale (typically j = 0) and continuing recursively 
until the stopping criterion is met for all subintervals. In practice, we choose a 
stopping scale j max , beyond which the algorithm will not attempt to subdivide 
any further. Reaching j max means that the function has significant variations 
which are not accurately resolved over an interval of width 2 _Jmax using a basis 
of order p. A pseudo-code listing of this process is presented as Algorithm 1. 



Algorithm 1 Adaptive Function Decomposition. 



Start at a coarse scale, typically j = 0. 
Recursively, for all boxes V on scale j, proceed as follows: 
Construct the list C of 2 d child boxes V +l on scale j + 1. 
Compute the values of the function f{V) at the p d Gauss-Legendre quadra- 
ture nodes in b° . 
for all boxes V +1 e C do 

From f{V), interpolate to the Gauss-Legendre quadrature nodes in hP +1 , 

producing values f{b> +1 ). 

Compute the values of f{V +1 ) at the Gauss-Legendre nodes of V +l , by 
direct evaluation. 



if 



f(V +1 ) - f(V +1 ) 



> e then 

Recursively repeat the entire process for all boxes V +1 G C. 
end if 
end for 

# Geting here means that the interpolation from the parent was successful 
for all child boxes. We store the parent's coefficients from V in the function 
tree. 
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(a) Function. 
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(b) Pointwise error. 
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(c) Adaptive tree 



(d) Redundant tree 



Figure 1. The function f(x) = sin(167rx 6 ) shown in (a), is decomposed with p = 8 
and e = 10 -4 ; (b) shows the pointwise approximation error, (c) is the resulting adap- 
tive tree, where smaller subdivisions are required in regions with higher frequency 
content, (d) is the redundant tree associated with this adaptive decomposition, where 
all internal nodes have been filled with data. 

2.2.1 Tree structures for representing functions 

The decomposition Algorithm 1 naturally leads to a tree data structure to 
represent functions, with the leaves of the tree corresponding to the spatial 
intervals over which the multiwavelet basis provides a sufficiently accurate 
local approximation. By using (30) or its higher-dimensional extensions, the 
only data needed to approximate f(x) anywhere in B is the array of basis 
coefficients on these leaves. Thus we use a tree structure where the leaves 
store these coefficients and the internal nodes do not contain any data (and 
are effectively removed since we use hash tables for storage). We will refer 
to this structure as an adaptive tree. Each level in the tree corresponds to a 
scale in the MRA, with the root node corresponding to the coarsest projection 

fo e v . 



Now, in order to apply the modified non-standard form of an operator to a 



7 



function, we will show in the next section that we also need the basis coeffi- 
cients corresponding to internal nodes of the tree. Hence, from the adaptive 
tree data structure we will compute a similar tree but where we do keep the 
coefficients of scaling functions on all nodes (leaves and internal). 

The coefficients on the internal nodes are redundant since they are computed 
from the function blocks stored in the leaves. We will thus refer to the tree 
containing coefficients on all nodes as a redundant tree. It is constructed re- 
cursively starting from the leaves, by projecting the scaling coefficients from 
all sibling nodes onto their parent node, using the decomposition (41). 

Figure 1 shows both the adaptive and the redundant trees for a sample func- 
tion. This figure displays the coarsest scales at the bottom and progressively 
finer ones further up, with filled boxes representing nodes where scaling co- 
efficients are stored and empty boxes indicating nodes with no data in them 
(these do not need to be actually stored in the implementation). 

2.3 Modified ns-form 

The non-standard form [4] (see also [13] for the version specialized for multi- 
wavelets) of the operator T is the collection of components of the telescopic 
expansion 

n-l 

T n = (T n - T n _i) + (T n _ a - T„_ 2 ) + • ■ ■ + T = T + £ (A, + B, + C,), (3) 

j=0 

where Aj = QjTQj, Bj = QjTPj, and Cj = PjTQj. The main property 
of this expansion is that the rate of decay of the matrix elements of the op- 
erators Aj, Bj and Cj away from the diagonal is controlled by the number 
of vanishing moments of the basis and, for a finite but arbitrary accuracy e, 
the matrix elements outside a certain band can be set to zero resulting in an 
error of the norm less than e. Such behavior of the matrix elements becomes 
clear if we observe that the derivatives of kernels of Calderon-Zygmund and 
pseudo-differential operators decay faster than the kernel itself. If we use the 
Taylor expansion of the kernel to estimate the matrix elements away from the 
diagonal, then the size of these elements is controlled by a high derivative of 
the kernel since the vanishing moments eliminate the lower order terms [4]. 
We note that for periodic kernels the band is measured as a periodic distance 
from the diagonal, resulting in filled-in 'corners' of a matrix representation. 

Let us introduce notation to show how the telescopic expansion (3) is used 
when applying an operator to a function. If we apply the projection of the 
operator Tj_i not on its "natural scale" j — 1, but on the finer scale j, we 
denote its upsampled version as j (Tj_i). In the matrix representation of 
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Tj-i, this operation results in the doubling of the matrix size in each direction. 
This upsampling j (•) and downsampling j (•) notation will also be used for 
projections of functions. 

With this notation, computing g = Tf via (3) splits across scales, 

9o = T /o 

h = [Ti- T (To)]/! 

g 2 = [T 2 - T (TOl/a (4) 



where f j = ~P j f. 

As in the application of the usual ns-form in [4], to obtain g n after building 
the set {go, gi, ■ ■ ■ , gn}, we have to compute 

g n = g n + T (g n -x+ T (g n ~2+ T {g n -z + ■■■ + (} go) ■■ ■))) ■ (5) 

The order of the parentheses in this expression is essential, as it indicates the 
order of the actual operations which are performed starting on the coarsest 
subspace V . For example, if the number of scales n = 4, then (5) yields 
#4 = <74+ T (<?3+ T (<?2+ T (<7i + (T §0)))), describing the sequence of necessary 
operations. 

Unfortunately, the sparsity of the non-standard form induced by the vanishing 
moments of bases is not sufficient for fast practical algorithms in dimensions 
other than d — 1. For algorithms in higher dimensions, we need an additional 
structure for the remaining non-zero coefficients of the representation. We will 
use separated representations (see Section 3) introduced in [6,18] and first 
applied in a multiresolution setting in [8,9,10,11]. Within the retained bands, 
the components of the non-standard form are stored and applied in a separated 
representation and, as a result, the numerical application of operators becomes 
efficient in higher dimensions. 



2.4 Modified ns-form in ID 

Let us describe a one-dimensional construction for operators on L 2 ([0, 1]) to 
introduce all the features necessary for a multidimensional algorithm. Since 
we use banded versions of operators, we need to introduce the necessary book- 
keeping. 
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The "template" for the band structure on scale j comes from the band on the 
previous scale j — 1. For each block on scale j — 1, the upsampling operation 
\ {Tj^i) creates four blocks (all combinations of even/odd row and column 
indices). We insist on maintaining the strict correspondence between these four 
blocks and those of Tj. For this reason the description of the retained blocks 
of Tj involves the parity of their row and column indices. Let us denote the 
blocks in the matrix representing Tj by P ]U , where I, I' — . . . 2 J '~ 1 . Individual 
elements within these blocks are indexed as t{f, 1 , where i, %' — 0, . . . ,p — 1, and 
p is the order of the multiwavelet basis. For a given width of the band bj, we 
keep the operator blocks P ]U with indices satisfying 

I — bj + 1 < I' < I + bj, for even Z, 

(6) 

l-bj < I' < l + bj - 1, for odd I. 

We denote the banded operators where we keep only blocks satisfying (6) as 
T h j and | (Tj_i) bj . If we downsample the operator j (T^!) 6j back to its 
original scale j — 1, then (6) leads to the band described by the condition 

I - [bj/2\ </'</+ L&J-/2J , (7) 

where L^'/2J denotes the integer part ofbj/2. We denote the banded operator 
on scale j — 1 as T J -j'{ 2 ', where we retain blocks satisfying (7). 

If we now rewrite (4) keeping only blocks within the bands on each scale, we 
obtain 

90 = T /o 

91 = [T* 1 - T (T ) 61 ]/! = T^h- T (T )Vi 

g 2 = [Tg 2 — T (T!) fe ]/ 2 = T 6 2 2 / 2 - T (Ti) b2 /2 (8) 
gj = [T^- T (Tj-^fj = T h ?fj- T (Tj^fj 



For any arbitrary but finite accuracy, instead of applying the full [Tj— 
(Tj_i)], we will only apply its banded approximation. 

A simple but important observation is that 

I ([T ( T i-i)]/i) = T i-i/i-i , (9) 

which follows from the fact that QjPj = PjQj = 0, since these are orthogonal 
projections. Thus, we observe that j. (f (Tj-i) bj ) = T^^] so instead of ap- 
plying j (Tj_i) bj /j on scale j, we can obtain the same result using (9), so that 
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Figure 2. Modified non-standard form of the convolution operator in (11) in a mul- 
tiwavelet basis, with white representing and black representing large values. The 
top matrix is the projection of this operator on V5, resulting in a dense matrix. The 
lower half depicts the multiresolution representation in (10) with only the blocks 
that are actually retained for a given accuracy. We will call the leftmost matrix in 
this series a whole band matrix and all others outer band matrices. The two empty 
outer band matrices on scales j = 0, 1 are explained in the main text. 

t (T^ 2j /i-i) =T (T> ,)'••/,. Therefore, we will compute only t}j?( 2J on 

scale j — 1 and combine it with computing T^lfj-i- Incorporating this into 
(5), we arrive at 



g n = T<£f n + T [(Ttn 1 - Ttl 2i ) U 



(10) 



,|&n-l/2j 



fn-2 + • • • + 



T [(t c 



1L61/2J 



fo 



Using this expression yields an efficient algorithm for applying an operator, 
as on each scale j, — Tj 6j+1 ^ is a sparse object, due to the cancellation 

which occurs for most of the blocks. In particular, T^ 3 — T^ J+i ^ 2 ' is missing 
the blocks near the diagonal, and we will refer to it as an outer band matrix. 
We will call Ty a whole band matrix as it contains both the inner and outer 
bands. 
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The structure of these matrices is illustrated in Figure 2. The two empty 
scales j = 0, 1 arise due to the complete cancellation of blocks on these scales. 
We note that the modified non-standard form is constructed adaptively in 
the number of scales necessary for a given function. For just two scales, this 
construction will have the scale j = 1 non-empty. Given an adaptive decom- 
position of a function on n scales, we precompute the modified non-standard 
form (depicted in Figure 2 for five scales) on all scales n,n — 1, . . . , 1. For 
matrices requiring 2 n ■ 2 n blocks on the finest scale n, we need to keep and 
apply only 0(2 n ) blocks, as with the original non-standard form in [4]. 



2.4-1 Adaptive application 

Let us show how to use the multiscale representation in (10) to apply the 
operator T to a function / with controlled accuracy e. We describe an adaptive 
application of the operator to a function, where we assume (as is often the 
case) that the tree structure of the input is sufficient to adequately describe 
the output with accuracy e. This assumption will be removed later. 

Our algorithm uses the structure shown in Figure 2 in an adaptive fashion. 
We copy the structure of the redundant tree for the input function, and use 
that as a template to be filled for the output g. For each node of the output 
tree we determine whether it is a leaf or an internal node: for leaves, we must 
apply a whole band matrix on the scale of that node, such as the leftmost 
matrix for j = 5 in the example shown in Figure 2. For internal nodes, we 
apply an outer band matrix (for that scale). We note that our construction 
of the operator produces both whole and outer band matrices for all scales, 
and we simply choose the appropriate kind for each node of output as needed. 
Upon completion of this process, we apply the projection (5) to construct the 
final adaptive tree representing the output. 

Algorithm 2 returns an adaptive tree representing the function g. This tree 
contains sufficient information to evaluate g at arbitrary points by interpola- 
tion and may be used as an input in further computations. 

We note that Step 2 in Algorithm 2 naturally resolves the problem that is 
usually addressed by mortar methods, see e.g. [19,20,21,22]. Since adaptive 
representations have neighboring blocks of different sizes, they encounter dif- 
ficulties when applying non-diagonal operators, as they require blocks which 
do not exist on that scale. Our approach simply constructs these as needed 
and caches them for reuse, without requiring any additional consideration on 
the part of the user. 
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Algorithm 2 Adaptive non-standard form operator application in d = 1, 
9 = Tf 

Initialization: Construct the redundant tree for / and copy it as skeleton 

tree for g (see Section 2.2.1). 

for all scales j = 0, . . . , n — 1 do 

for all function blocks gj in the tree for g at scale j do 

# Step 1. Determine the list of all contributing blocks of the modified 
ns-form T\ v (see Section 2.3): 

if gj belongs to a leaf then 

Read operator blocks T J U , from row / of whole band matrix . 
else 

Read operator blocks T\ v from row I of outer band matrix Tf — 

j 

end if 

# Step 2. Find the required blocks f 3 v of the input function f : 
if function block f(, exists in the redundant tree for / then 

Retrieve it. 
else 

Create it by interpolating from a coarser scale and cache for reuse, 
end if 

# Step 3. Output function block computation: 

Compute the resulting output function block according to gj = 
J2i> T J u ,fi,, where the operation T\ v fl indicates a regular matrix- vector 
multiplication, 
end for 
end for 

# Step 4- Adaptive projection: 

Project resulting output function blocks gj on all scales into a proper adap- 
tive tree by using Eq. (5). 

Discard from the resulting tree unnecessary function blocks at the requested 
accuracy. 

Return: the function g represented by its adaptive tree. 



2.4-2 Numerical example 

Let us briefly illustrate the application of the modified ns-form with an exam- 
ple of a singular convolution on the unit circle, the operator with the kernel 

K(x) = cot(nx), 

(Cf)(y) = p.v. f 1 cot(7r(y - x)) f{x) dx, (11) 
Jo 

a periodic analogue of the Hilbert transform. In order to find its representation 
in multiwavelet bases, we compute 
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rijf = 2~ j ^ K(2- j (x + l))<& iV {x)dx = T j f cot(7r2- J '(x + /)) <$> iV {x) dx , 

(12) 

where <£v(x), = 0, . . . , k — 1 are cross-correlation functions described 
in Appendix 8.4 and I = 0, ±1, ±2, . . . 2 J — 1. We compute r--/ using the 
convergent integrals 

t# = 2"' E 4 /' $+ (x) (cot(7r2^(x + 0) + (-l) W 'cot(7r2-^-x + 0)) dx, 

where $^" is a polynomial described in Appendix 8.4. In our numerical ex- 
periment, we apply (11) to the periodic function on [0,1], 

/( I ) = ^ e - fl ^- 1 / 2)2 ) 

which yields 

(Cf)(y) = - E e- afe+fc - 1/2)2 Erfi[^(y+fc-l/2)] = £ sign^e^/V 7 ^, 

2 2 (13) 
where e v Erfi(y) = /q e s a ds. Expression (13) is obtained by first observ- 
ing that the Hilbert transform of e~ ax2 is — e~ ay2 Erfi(y/ay) , and then evaluating 
the lattice sum, noting that (see [23, formula 4.3.91]) 

Table 1 summarizes the numerical construction of this solution for a = 300, 
at various requested precisions. Optimal performance is obtained by adjusting 
the order of the basis p as a function of the requested precision, to ensure that 
the operator remains a banded matrix with small band, and that the adaptive 
representation of the input function requires a moderate number of scales. 
The resulting numerical error (as compared to the exact analytical solution), 
measured in the £ 2 norm, is shown in the last column. Figure 3 shows the 
input and results for this example, as well as the point- wise error for the case 
where e req = 10~ 12 and p = 14 (the last row in the table). 



3 Separated representations of integral kernels 

The approach we've discussed so far does not efficiently generalize to the 
application of non-separable multidimensional integral kernels. Since several 
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Table 1 

Results from evaluating (13) with our algorithm. The order of the basis p is adjusted 
as a function of the requested precision e. The second column indicates scales present 
in the adaptive tree for the input. The third column shows the total number of 
blocks of coefficients in this tree. The last column (E2) shows the actual error of the 
computed solution in the £ 2 norm. 
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Figure 3. Results of applying the cotangent kernel to a periodized Gaussian using 
basis of order p = 14 (the last row in Table 1). The pointwise error is shown on the 
right for a requested accuracy of e = 1CT 12 . 

physically important kernels belong to this category (e.g. the Poisson kernels 
in d = 2 and d=3), additional tools are needed. We now describe the key idea 
that allows us to perform this generalization to d > 1. 



We use the separated representation of operators introduced in [6,24] to re- 
duce the computational cost of the straightforward generalization of the mul- 
tiresolution approach in [4]. Such representations are particularly simple for 
convolution operators and are based on approximating kernels by a sum of 
Gaussians [24,8,9,11,10]. This approximation has a multiresolution character 
by itself and requires a remarkably small number of terms. In fact, our algo- 
rithm uses the coefficients and the exponents of the Gaussian terms as the only 
input from which it selects the necessary terms, scale by scale, according the 
desired accuracy threshold e. Therefore, our algorithm works for all operators 
with kernels that admit approximation by a sum of Gaussians. Examples of 
such operators include the Poisson and the bound state Helmholtz Green's 
functions, the projector on divergence free functions, as well as regular and 
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fractional derivative operators. Let us consider a particular family of operators 
(—A + ji 2 I)~ a , where n > and < a < 3/2. The kernel of this operator 



where K|_ Q is the modified Bessel function, r = ||x — y|| and C a = 2 ■ 
2 _2a 7r~i/r(a), has an integral representation 

K^{r) = C a r e- r2e2s e-^ 2e ~ 2s+ ^- 2a >ds. (14) 

J — oo 

Using the trapezoidal rule, we construct an approximation valid over a range 
of values 5 < r < R with accuracy e, of the form 



/ N ^ -r / ^ T(3/2 - «) ■ C, 



m=l 



< ^oAr) = 6 MO/ ;_3_^ (15) 



2H 



where r m = e 2s ™, u; m = /iC Q e-" 2e_2sm / 4+ ( 3 - 2a > s ™, h = (B - A)/M and s m = 
A + mh. The limits of integration, A, B and the step size h are selected as 
indicated in [24], where it is shown that for a fixed accuracy e the number of 
terms M in (15) is proportional to log(i?5 _1 ). Although it is possible to select 
S and R following the estimates in [25] and optimize the number of terms for 
a desired accuracy e, in this paper we start with an approximation that has an 
obviously excessive range of validity and thus, an excessive number of terms. 

An example of such approximation is shown in Figure 4. For a requested toler- 
ance of e = 1CT 10 , roughly 300 terms are enough to provide a valid approxima- 
tion over a range of 15 decades. We then let the algorithm choose the necessary 
terms, scale by scale, to satisfy the user-supplied accuracy requirement e. This 
approach may end up with a few extra terms on some scales in comparison 
with that using a nearly optimal number of terms [8,10,11,9]. Whereas the 
cost of applying a few extra terms is negligible, we gain significantly in having 
a much more flexible and general algorithm. 

We note that approximation in (15) clearly reduces the problem of applying 
the operator to that of applying a sequence of Gauss transforms [26,27], one 
by one. From this point of view, the algorithm that we present may be con- 
sidered as a procedure for applying a linear combination of Gauss transforms 
simultaneously. 

In order to represent the kernel K of the operator in multiwavelet bases, we 
need to compute the integrals, 
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Figure 4. Relative error of the Gaussian approximation for the Poisson kernel in 3 
dimensions. This unoptimized expansion uses 299 terms to cover a dynamic range 
of roughly 15 decades with e = 10~ 10 relative accuracy. 



m=l 

(16) 

where $u'(x) are the cross-correlations of the scaling functions (see Appendix). 
We obtain 



M 



l l l l' l 2*2>*3 l 3 ' "" 1*1 '2'2 '3'3 

m=l 



where 

l r 1 



E - h 'Xt 'Xt ■ (l7) 

n=l 

i$ m,i = ^ e^ x+l ^^^(x)dx. (18) 



Since the Gaussian kernel is not homogeneous, we have to compute integrals 
(18) for each scale. Although in principle Z e Z, in the next section we explain 
how to restrict it to a limited range on each scale, for a given accuracy e. 



4 Modified ns-form of a multidimensional operator 



In this section, we describe how the separated representation approximations 
of Section 3 can be used to construct a multidimensional extension of the ns- 
form representation from Section 2.3, using only one-dimensional quantities 
and norm estimates. This makes our approach viable for d > 1. We use the 
modified ns-form as in the one-dimensional case described in Section 2.3. We 
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find the ns-form essential for adaptive algorithms in more than one dimension, 
since: 



(1) Scales do not interact as the operator is applied. All interactions between 
scales are accounted for by the (inexpensive) projection at the final step 
of the algorithm. 

(2) For the same reason, the subdivision of space at different scales natu- 
rally maps into the supporting data structures. We note that one of the 
main difficulties in developing adaptive algorithms is in organizing com- 
putations with blocks of an adaptive decomposition of a function from 
different scales but with a common boundary. The methods for such com- 
putations are known as mortar elements methods. In our approach this 
issue does not present any obstacle, as all relevant interactions are natu- 
rally accounted for by the data structures. 

The key feature that makes our approach efficient in dimensions d > 2 is 
the separated structure of the modified ns-form. Namely, the blocks of Tj— T 
(Tj_i) are of the form (for d = 3) 



M 

Tj- T (Tj_!) = w m F j > mh F j ' ml *F j > mh 

m=l 

- t ( w m F j - v ' mll F j - 1 ' mh F j - 1 ' ml ^\ 

\m=l / 

M 

= ]T Wm F^ mh F jimh F j ' ml3 (19) 
i 

M 

w m T (F^ 1;mil )- t (F j - V > rnh )- t (F j - V ' mh ). 



m=l 
M 



m=l 



As in the case d — 1, the norm of the operator blocks of Tj— f (T^_ 1 ) decays 
rapidly with \\£\\, £ = (Zi, l 2 , h), and the rate of decay depends on the number 
of vanishing moments of the basis [4]. Moreover, we limit the range of shift 
indices \\£\\ using only one-dimensional estimates of the differences 

F j,m,2l_ | F j;m,l ^ pj^l+l _ | F y, m ,l ^ 

of operator blocks computed via (18). The norms of individual blocks F^,™' 1 
are illustrated in Figure 5 (a), for scale j = 1. 

By selecting the number of vanishing moments for a given accuracy, it is suffi- 
cient to use H^Hoo < 2 in practical applications that we have encountered. Also, 
not all terms in the Gaussian expansion of an operator need to be included 
since, depending on the scale j, their contribution may be negligible for a 
given accuracy, as shown in Figure 5 (b) . Below we detail how we select terms 
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of the Gaussian expansion on a given scale as well as the significant blocks of 
T £ j— t (Tj-i)- This procedure establishes the band structure of the operator. 

We then project the banded operator j (Tj-i) back to the scale j — 1 and 
then combine blocks on the natural scale for each projection in order to apply 
the operator efficiently as was explained in Section 2.3 for the one-dimensional 
case. 

We note that in deciding which terms to keep in (19), we do not compute the 
difference between the full three dimensional blocks as it would carry a high 
computational cost; instead we use estimates based on the one dimensional 
blocks of the separated representation. We note that since the resulting band 
structure depends only on the operator and the desired accuracy of its ap- 
proximation, one of the options is to store such band information as it is likely 
to be reused. 

In order to efficiently identify the significant blocks in T*- j as a 

function of £, we develop norm estimates based only on the one-dimensional 
blocks. The difference between two terms of the separated representation, say 
F±F 2 F 3 — G\G 2 G 3 , may be written as 

FiF 2 F 3 — GiG 2 G 3 = (Fx — Gi)F 2 F 3 + G 1 (F 2 — G 2 )F 3 + GiG 2 (F 3 — G 3 ). 

We average six different combinations of the three terms to include all direc- 
tions in a symmetric manner, which results in the norm estimate 

HFiFaFg - G 1 G 2 G 3 \\ < ^sym ^\F X - G x \\ \\F 2 \\ \\F 3 \\+ (21) 

II GiU || F 2 — G 2 \\ \\F 3 \\ + || Gi|| \\G 2 \\ \\F 3 — G 3 \\] , 

where the symmetrization is over the three directions and generates 18 terms. 
For the rotationally symmetric operators with Gaussian expansion as in (15) 
computing the right hand side in this estimate involves just three types of one 
dimensional blocks and their norms, 





pj;m;l_ 




N Jjm;l = 


pj;m;l 


• 




AT j;m;l _ 

JV TF — 


T (F j - 


-l;m;Z^ 


) 



where index j indicates the scale, m the term in the Gaussian expansion (15), 
and / the position of the block in a given direction. 

These estimates allow us to discard blocks whose norm falls below a given 
threshold of accuracy, namely, for each multi-index £, we estimate 
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(a) Norms of each one-dimensional 
block computed via (18) for scale j = 1, 
as a function of the index m in the sep- 
arated representation. 
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(b) Norm estimates (23) for scale j = 1 
as a function of the index m in the sep- 
arated representation. Based on these 
estimates, only terms above the cutoff 
are actually applied. 



Figure 5. Comparison of norms of matrix blocks generated by the Gaussian ap- 
proximation for the Poisson kernel in dimension d = 3. In each picture, the curves 
correspond to the different offsets I for which blocks are generated. Figure (b) illus- 
trates the estimate in (23) (see main text for details). 



M 



Tj-tCTj.!) <-$> m sym 



N^ h Np m ' h Np m ' l3 + 



Q m=l 



(23) 



For each scale j and each block T^— j (Ty-i) labeled by the multi-index 
^ — (h, h, h)i we compute all terms of the sum in (23) and identify the range 
[m 1 ,m 2 ] which we need to keep for that block, by discarding from the sum 
terms whose cumulative contribution is below e. If the entire sum falls below e, 
this range may be empty and the entire Tj — j (Tj_i) is discarded. The range 
[mi,m 2 ] differs significantly depending whether or not the block is affected 
by the singularity of the kernel as is illustrated in Figure 5. In Figure 5 (a) 
the rate of decay for the blocks with shift \l\ = 2, 3 is significantly faster than 
for the blocks with \l\ < 1 affected by the singularity. We note that all blocks 
of the first 150 terms in the separated representation (17) have norm 1 (and 
rank 1 as matrices) and are not shown in Figure 5 (a). 

Since the difference in (23) involves blocks upsampled from a coarser scale, all 
shifts \l\ < 3 are affected by the singularity. Figure 5 (b) shows the r.h.s. of 
the estimate in (23) for different shifts along one of the directions, where the 
blocks along the other two directions are estimated by the maximum norm 
over all possible shifts. 

After discarding blocks with norms less that e using the estimate in (23), we 
downsample the remaining blocks of f (Tj_ x ) back to the original scale. This 
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leaves only blocks of Tj on the scale j and we remove additional blocks of 
T £ j for the shifts |/| = 2,3 where the decay is sufficiently fast to make their 
contribution less than e. 

This leads us to arrange the blocks on each scale into several subsets by the 
effect the singularity of the kernel has on them and find the appropriate range 
[mi, 7712] for each subset. There are three such sets in dimension d = 2 and 
four sets in dimension d = 3. For each index /, we will say that the index 
belongs to the core if I = —1,0,1 and to the boundary otherwise. The core 
indices correspond to one-dimensional blocks whose defining integrals include 
the singularity of the kernel. We then divide all possible values of the multi- 
index £ = (h,l2,h), according to the number of core indices it has. In d = 3 
this gives us four sets: 

• Core: all indices (li, Z 2 , h) belong to the core. 

• Boundary- 1: two of the indices belong to the core and one to the boundary 

• Boundary-2: one of the indices belongs to the core, the other two to the 
boundary 

• Boundary-3: all indices (h,l2,h) belong to the boundary. 

We then find the range [m 1 ,m 2 ] for each subset and apply blocks of each 
subset separately, thus avoiding unnecessary computations with blocks whose 
contribution is negligible. This range analysis only needs to be done once per 
operator and the desired accuracy, and the results may be saved for repeated 
use. 



5 Multidimensional adaptive application of ns-form 



In this section we present an algorithm for applying the modified non-standard 
form which is an extension of (2) (based on (8) and (10)) to higher dimensions. 
We are now seeking to compute 

g(x) = [T/](x) = J K(y - x)/(y)dy, 

where x,y e R d for d = 2,3. The separated approximation (19) reduces the 
complexity of applying the operator by allowing partial factorization of the 
nested loops in each scale indicated by the order of summation and illustrated 
for d = 3, 
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gj;hhh = ^2 Wm J pr>m;h-li pj;m,h-l' 2 pj;m,l 3 -l' 3 _ 

71 

T 



'1 '2 '3 



^j'-l;m;£i-ii pj-l;rn,l 2 -l' 2 pj-l;m,l 3 -l' 3 
. ''1 '2 '3 



As described in the previous section, this evaluation is done by regions of in- 
dices. These regions of indices are organized so that (for a given accuracy) the 
number of retained terms of the separated representation is roughly the same 
for all blocks within each region. Thus, we perform the summation over the 
terms of the separated representation last, applying only the terms that actu- 
ally contribute to the result above the requested accuracy threshold, according 
to estimate (23). Therefore, we avoid introducing checks per individual block 
and the resulting loss of performance. 

Just as in the one-dimensional case, we use (19) in a 'natural scale' manner. 
That is, blocks belonging to scale j are only applied on that scale. As in one- 
dimensional case, the interaction between scales is achieved by the projection 
(10) that redistributes blocks accumulated in this manner properly between 
the scales to obtain the adaptive tree for the resulting function. The overall 
approach is the same as described in (2). We note that, as expected, the sep- 
arated representation requires more detailed bookkeeping when constructing 
the data structures for the operator. 



Remark 1 Our multiresolution decomposition corresponds to the geometri- 
cally varying refinement in finite element methods. In this case the adjoining 
boxes do not necessarily share common vertices, forming what corresponds to 
the so-called non-conforming grid. In finite element methods such situation 
requires additional construction provided by the mortar element methods. 
Mortar element methods were introduced by Patera and his associates, see 
e.g. [19,20,21,22]. These methods permit coupling discretizations of different 
types in non-overlapping domains. Such methods are fairly complicated and 
involve, for example, the introduction of interface conditions through an L 2 
minimization. In our approach we do not face these issues at all and do not 
have to introduce any additional interface conditions. The proper construc- 
tion for adjoining boxes is taken care by the redundant tree data structure 
and Step 2 of Algorithm 3 for applying the kernel, which generates the neces- 
sary missing boxes on appropriate scales. 

Remark 2 Although Algorithm 3 applies convolution operators, only minor 
changes are needed to use it for non-convolutions. Of course in such case, the 
separated representation for the modified ns-form should be constructed by a 
different approach. 
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Algorithm 3 Adaptive non-standard form operator application in multiple 
dimensions (illustrated for d — 2), g — Tf 

Initialization: Construct the redundant tree for / and copy it as skeleton 

tree for g (see Section 2.2.1). 

for all scales j — 0, . . . , n — 1 do 

for all function blocks gj ll2 in the tree for g at scale j do 

# Step 1. Determine the list of all Core, Boundary-1 and Boundary-2 
contributing operator blocks of the modified ns-form F^' ,m ' ,ll ~ 1 ^, F^' ,m ' ,l2 ~ 1 ^ 
(see Section 4 )'■ 

if gj il2 belongs to a leaf then 

Read operator blocks F j '' m > ll ~ l 'i, F j ' m ' l2 ~ l 2 for Core, Boundary-1 and 
Boundary-2, and their weights w m and corresponding ranges from 
the separated representation. 

else 

Read operator blocks F j ' m ' ll ~ l i, F j ' m > l2 ~ l 2 for Boundary-1 and 
Boundary-2, and their weights w m and corresponding ranges from 
the separated representation, 
end if 

# Step 2. Find the required blocks fyy of the input function f : 

if function block exists in the redundant tree for / then 

Retrieve it. 
else 

Create it by interpolating from a coarser scale and cache for reuse, 
end if 

# Step 3. Output function block computation: 

For each set S of indices determined in Step 1 (Core, Boundary-1, 
Boundary-2) and the corresponding ranges of terms in the separated 
representation, compute the sum 

& = E *>m E F j ' ,m '' k - l[ E ^ ;m; ' 2 "%, , 

'l '2 

Add all computed sums to obtain gj ll2 . 
end for 
end for 

# Step 4- Adaptive projection: 

Project resulting output function blocks gj il2 on all scales into a proper 
adaptive tree by using Eq. (5). 

Discard from the resulting tree unnecessary function blocks at the requested 
accuracy. 

Return: the function g represented by its adaptive tree. 
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5.1 Operation count estimates 



Adaptive decomposition of functions 



The cost of adaptively decomposing a function in d dimensions is essentially 
that of an adaptive wavelet transform. Specifically, it takes 0(N h \ ocks ■ p d+l ) 
operations to compute such representation, where A^iocks is the final number 
of significant blocks in the representation and p is the order of multiwavelet 
basis chosen. In comparison with the usual wavelet transform, it appears to 
be significantly more expensive. However, these 0{p d+1 ) operations process 
0{p d ) points, thus in counting significant coefficients as it is done in the usual 
adaptive wavelet transform, we end up with 0(p) operations per point. 



Operator application 

The cost of applying an operator in the modified ns-form is 0(N hlocks Mp d+1 ) , 
where A^ Wocks is the number of blocks in the adaptive representation of the 
input function, M is the separation rank of the kernel in (15) and p is the 
order of the multiwavelet basis. For a given desired accuracy e, we typically 
select p oc loge -1 ; M has been shown to be proportional to (loge -1 ) 1 ', where v 
depends on the operator [24]. In our numerical experiments, M is essentially 
proportional to loge" 1 , since we never use the full separated representation, 
as discussed in Section 4 and illustrated in Figure 5. 

This operation count can be potentially reduced to 0(N b \ ocks Mp d ) by using 
the structure of the matrices in 18, and we plan to address this in the future. 



Final projection 



After the operator has been applied to a function in a scale-independent fash- 
ion, a final projection step is required as discussed in Section 2.3. This step 
requires C(A^i cks ~P d ) operations, the same as in the original function decom- 
position. In practice, this time is negligible compared to the actual operator 
application. 
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6 Numerical examples 



6.1 The Poisson equation 

We illustrate the performance of the algorithm by solving the Poisson equation 
in d = 3 

V 2 0(r) = -p(r) (24) 
with free space boundary conditions, </>(r) — > and d(f>/dr — > as r — > oo. We 
write the solution as 




and adaptively evaluate this integral. We note that our method can equally 
be used for d = 2, since the corresponding Green's function can also be rep- 
resented as a sum of Gaussians, and the operator application algorithm has 
been implemented in for both d = 2 and d = 3. 

For our test we select 

0(r)=f e^l-'l 2 , 

i=l 

so that we solve the Poisson equation with 

3 

p(r) = -V 2 0(r) = - ^(4a 2 |r - r,| 2 - 6a)e" a|r " ri1 . 

i=l 

Our parameters are chosen as follows: a = 300, r± = (0.5,0.5,0.5), r 2 = 
(0.6,0.6,0.5) and r 3 = (0.35,0.6,0.5). These ensure that p(r) is well below 
our requested thresholds on the boundary of the computational domain. All 
numerical experiments were performed on a Pentium-4 running at 2.8 GHz, 
with 2 GB of RAM. The results are summarized in Table 2. 

In order to gauge the speed of algorithm in reasonably computer-independent 
terms, we use a similar approach to that of [17] and also provide timings of 
the Fast Fourier Transform (FFT). Specifically, we display timings for two 
FFTs as an estimate of the time needed to solve the Poisson equation with 
a smooth right hand side and periodic boundary conditions in a cube. As in 
[17], we compute the rate that estimates the number of processed points per 
second. We observe that for our adaptive algorithm such rate varies between 
3.4 • 10 4 and 1.1 • 10 5 (see Table 2), whereas for the FFTs it is around 10 6 (see 
Table 3). We note that our algorithm is not fully optimized, namely, we do not 
use the structure of the matrices in (18) and the symmetries afforded by the 
radial kernels. We expect a substantial impact on the speed by introducing 
these improvements and will report them separately. 
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Requested e = 10 



P 


E 2 




Time (s) 


Rate (pts/s) 


6 


5.0 • 10" 3 


7.9 • 10" 1 


1.2 


7.2 • 10 4 


8 


1.7-10- 3 


1.2 • 10" 1 


0.51 


7.3 • 10 4 


10 


4.4 • lO" 4 


3.7- 10~ 2 


0.68 


1.1 • 10 5 



Requested e = 10 



p 


E 2 


Eoo 


Time (s) 


Rate (pts/s) 


10 


4.7- 10~ 6 


3.6 • 10~ 4 


10.3 


5.7- 10 4 


12 


8.5 • 10~ 6 


4.3 • 10~ 5 


13.5 


7.5 • 10 4 


14 


6.9 • 10~ 8 


5.2 • 10~ 6 


20.0 


8.0 • 10 4 



Requested e = 10 



p 


E 2 


Eoo 


Time (s) 


Rate (pts/s) 


16 


2.5 • 10~ 10 


2.2 • 10~ 8 


68.1 


3.5 • 10 4 


18 


7.7- lO" 11 


3.5 • 10" 9 


100.3 


3.4 • 10 4 


20 


1.2 • lO" 10 


1.8 • 10~ 8 


133.4 


3.5 • 10 4 



Table 2 

Accuracy and timings for the adaptive solution of the Poisson equation in (24). 



size 


Time (s) 


Rate (pts/s) 


32 3 


0.02 


1.7- 10 6 


64 3 


0.22 


1.2 • 10 6 


128 3 


2.21 


9.5 • 10 5 


256 3 


20.7 


8.1 • 10 5 



Table 3 

Timings of two 3D FFTs to estimate the speed of a non-adaptive, periodic Poisson 
solver on a cube for smooth functions. 



We note that the multigrid method (see e.g. [28,29]) is frequently used as a 
tool for solving the Poisson equation (and similar problems) in differential 
form. The FFT-based gauge suggested in [17] is useful for comparisons with 
these algorithms as well. 
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Figure 6. A two-dimensional slice of the three-dimensional subdivision of space by 
the scaling functions and an illustration of the source term for the Poisson equation 
(24). 

6.2 The ground state of the hydrogen atom 

A simple example of computing the ground state of the hydrogen atom illus- 
trates the numerical performance of the algorithms developed in this paper, 
and their utility for constructing more complex codes. The eigenfunctions 
-0 for the hydrogen atom satisfy the time-independent Schrodinger equation 
(written in atomic units and spherical coordinates), 

-Ia^-±.1> = EiI>, (25) 

where E is the energy eigenvalue. For the ground state, E = —1/2 and the 
(unnormalized) wave function is ip = e~ r . Following [30], we write 

= -2G^0, (26) 

where G M = (—A + /x 2 Z) _1 is the Green's function for some /x and V — —1/r 
is the nuclear potential. For /x = \J —IE the solution of (26) has ||0||2 = 1 
and coincides with that of (25). We solve (26) by a simple iteration starting 
from some value /xo and changing /x to obtain the solution with ||0||2 = 1- The 
algorithm proceeds as follows: 

(1) Initialize with some value /xq and function 0. The number of iterations of 
the algorithm is only weakly sensitive to these choices. 

(2) Compute the product of the potential V and the function <fi. 

(3) Apply the Green's function to the product V(j) via the algorithm of 
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Figure 7. Convergence of the iteration to obtain the ground state of the hydrogen 
atom computed via formulation in (26) for the non-relativistic Schrodinger equation. 
The requested accuracy in applying the Green's function is set to 10 -6 . 

this paper to compute 

4>new = -2G^V(j). 

(4) Compute the energy for 4> ne w, 

new: ^0neui) (V *Pnewi 4^ new) 
t^new — 77 7 \ ■ 

yrnewi (Pnew) 

(5) Set n = y/-2E new , 4> = (j>new/\\<f>new\\ and return to Step 2. 

The iteration is terminated as the change in ji and ||0|| — 1 falls below the 
desired accuracy. The progress of the iteration is illustrated in Figure 7. The 
computations in Steps 2 and 4 use the three dimensional extension of the 
approach described in [13], to compute point-wise multiplications of adaptively 
represented functions and weak differential operators of the same. 

This example illustrates an application of our algorithm to problems in quan- 
tum chemistry. Multiresolution quantum chemistry developed in [8,9,10,11] 
also uses separated representations. The main technical difference with [8,9,10,11] 
is that we use the modified non-standard form and apply operator blocks on 
their "natural" scale, thus producing a fully adaptive algorithm. We are cur- 
rently using this algorithm as part of a new method for solving the multipar- 
ticle Schrodinger equation and will report the results elsewhere. 



7 Discussion and conclusions 



We have shown that a combination of separated and multiresolution repre- 
sentations of operators yields a new multidimensional algorithm for applying 
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a class of integral operators with radial kernels. We note that the same algo- 
rithm is used for all such operators as they are approximated by a weighted 
sum of Gaussians. This fact makes our approach applicable across multiple 
fields, where a single implementation of the core algorithm can be reused for 
different specific problems. The algorithm is fully adaptive and avoids issues 
usually addressed by mortar methods. 

The method of approximation underlying our approach is distinct from that of 
the FMM, has similar efficiency and has the advantage of being more readily 
extendable to higher dimensions. We also note that semi-analytic approxima- 
tions via weighted sums of Gaussians provide additional advantages in some 
applications. Although we described the application of kernels in free space, 
there is a simple extension to problems with radial kernels subject to periodic, 
Dirichlet or Neumann boundary conditions on a cube that we will describe 
separately. 

The algorithm may be extended to classes of non-convolution operators, e.g., 
the Calderon-Zygmund operators. For such extensions the separated represen- 
tation may not be available in analytic form, as it is for the operators of this 
paper, and may require a numerical construction. The separated representa- 
tion of the kernel permits further generalization of our approach to dimensions 
d 3> 3 for applying operators to functions in separated representation. 

A notable remaining challenge is an efficient, high order extension of this tech- 
nique to the application of operators on domains with complicated geometries 
and surfaces. 



8 Appendix 



8. 1 Scaling functions 



We use either the Legendre polynomials P , . . . , P p -\ or the interpolating poly- 
nomials on the Gauss-Legendre nodes in [—1, 1] to construct an orthonormal 
basis for each subspace V, [12,13]. 

Let us briefly describe some properties of the Legendre scaling functions <pk, 
k — 0, . . . ,p — 1, defined as 

V2k + lP k (2x-l),xe[0,l] 
M x ) = \ , (27) 

o, x i [o, i] 

and forming a basis for V . The subspace V, is spanned by 2 J p functions 
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obtained from 0o> • ■ • , by dilation and translation, 

4? kl {x) = 2 J / 2 0fc(2 J x -I), k = 0, . . . ,p - 1, / = 0, . . . , 2 j - 1. (28) 

These functions have support on [2~H,2~i(l + 1)] and satisfy the orthonor- 
mality condition 

/oo 
<foi{?Wm>( x ) dx = bkk'hv • (29) 
-oo 

A function /, defined on [0, 1], is represented in the subspace V 3 - by its nor- 
malized Legendre expansion 

/(^EE^W, (so) 

1=0 k=0 

where the coefficients are computed via 

sh= / . f{xW kl {x)dx. (31) 

J2-H 

As long as /(x) is smooth enough and is well approximated on [2~H, 2" J (/ + 1)] 
by a polynomial of order up to 2p — 1, we may use Gauss-Legendre quadratures 
to calculate the s 3 kl via 

4i = 2-^ 2 X)/(2^(x i + O)0*(^K (32) 

i=0 

where x , . . . , x v _\ are the roots of P p (2x — 1) and w 0) . . . , w p _i are the corre- 
sponding quadrature weights. 

In more than one dimension, the above formulas are extended by using a tensor 
product basis in each subspace. For example, in two dimensions equation (30) 
becomes 

f(x,x') = EEE E ^mMxWk'A*')- (33) 

1=0 fc=0 l'=0 k'=0 

8. 2 Multiwavelets 

We use piecewise polynomial functions ipo, ■ ■ ■ , i>p-i as an orthonormal basis 
for W [12,13], 

/ ifj i (x)ip j (x)dx = 5ij. (34) 

J 

Since W _L V , the first p moments of all . . . , ip p -\ vanish: 

/ tl>i(x)x l dx = 0, i,j — 0,l,...,p—l. (35) 
Jo 
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The space Wj is spanned by 2 J p functions obtained from ipo, ■ ■ ■ ,4> P -\ by 
dilation and translation, 

^ kl (x) = V' 2 ^x - 1), fc = 0,...,p-l, l = 0,...,V-l, (36) 

and supported in the interval Iji = [2 -J 7, 2 -J '(Z + 1)]. A function f(x) defined 
on [0, 1] is represented in the multiwavelet basis on n scales by 

p-1 n-12J-lp-l 

/(*) = E + E E E (3?) 

fc=0 j=0 1=0 k=0 

with the coefficients d kl computed via 

di,= / f(xUUx)dx. (38) 



5.5 Two-scale relations 

The relation between subspaces, V ©W = V 1; is expressed via the two-scale 
difference equations, 

p-i 

fc (x) = E (42'<M2x) + 4lW(2x - 1)) , fc = 0, . . . ,p - 1, (39) 

fe'=0 

p-i 

^fc(x) = v/2 E ($iMte) + g^Uk'^x - 1)) , fc = 0, . . . ,p - 1, (40) 

fe'=0 

where the coefficients h[f , h\f and gf^ , depend on the type of polyno- 
mial basis used (Legendre or interpolating) and its order p. The matrices of 
coefficients 

tf (0) = {/& # (1) = {/42}, G< 0) = ^) = {,«} 

are the multiwavelet analogues of the quadrature mirror filters in the usual 
wavelet construction, e.g., [15]. These matrices satisfy a number of important 
orthogonality relations and we refer to [13] for complete details, including the 
construction of the H, G matrices themselves. Let us only state how these 
matrices are used to connect the scaling s kl and wavelet d kl coefficients on 
neighboring scales j and j + 1. The decomposition procedure (j + 1 — - > j) is 
based on 
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p-1 

6 fc£ — \ n kk lb k,2l + a kk' b k,2l+l > l^-U 



fc'=0 
p-1 

(0) J+l (1) j+l 



— E \9kk' s k% + #fcfc' s fc^2Z+l) ) (42) 

the reconstruction (j — > j + 1) is based on 



S^fi^ + i^), (43) 

fc'=0 

4S + i = E + • (44) 

fc'=0 



5.^ Cross- correlation of the scaling functions 



For convolution operators we only need to compute integrals with the cross- 
correlation functions of the scaling functions, 

/oo 
(j>i(x + y)(j>i>(y)d,y. (45) 
-oo 

Since the support of the scaling functions is restricted to [0, 1], the functions 
are zero outside the interval [—1,1] and are polynomials on [—1,0] and 
[0, 1] of degree i + i' + 1, 



$+,(x), 0<z<l, 



where i, %' — 0, . . . ,p — 1 and 



<5>i,,(x), -l<x <0, (46) 



0, 1 < \x\ 



cl — x rl 

®u'( x ) = I </>i(x + y)My)dy, ®iA x ) = / + v) Mv)dy ■ (47) 







We summarize relevant properties of the cross-correlation functions in 
Proposition 3 

(1) Transposition of indices: &u>(x) = (— iy +l ' &i'i(x) , 

(2) Relations between $ + and $~; $^,(-x) = {-l) i+i> ^^{x) for < x < 1, 
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(3) Values at zero: $^(0) = ifi^i', and $u(0) = 1 for i = 0, . . . ,p - 1, 

(4) Upper bound: max xe [^i :1 ]\^ii>(x)\ < 1 for i, i' = 0, . . . ,p — 1, 

(5) Connection with the Gegenbauer polynomials: 

= |c{" 1/2) (2x - 1) + \ and $+(x) = \ v^ + TC^ 1/2) (2z - 1), 
for 1 = 1,2,..., where C^ +1 is the Gegenbauer polynomial, 

(6) Linear expansion: if i' > i then we have 

E (48) 

J=i'-t 

4/(1 + 1) Jo 1 $+ (x)$+(x)(l - (2x - l) 2 )" 1 ^, i' > i, 

4 = < 

41(1 + 1) jftS+C*) - $ + o(x)) $+(x)(l - (2s - 1) 2 )-Mx, i' = i, 

(49) 

/or / > 1 and c% = 5u> . 

(7) Vanishing moments: we have j\& 00 (x) dx = 1 and j\ x k <&u>(x) dx = 
for i + %' > 1 and 0<k<i + i' — 1 . 

Proof of these properties can be found in [25]. 



5.5 Separated representations of radial functions 



As an example, consider approximating the function l/r a by a collection of 
Gaussians. The number of terms needed for this purpose is mercifully small. 
We have [24] 

Proposition 4 For any a > 0, < 5 < 1, and < e < min{|,^}, there 
exist positive numbers r m and w m such that 

< r- a e, for all 5<r<l (50) 

with 

M = loge _1 [co + ci loge" 1 + c 2 log cT 1 ], (51) 

where are constants that only depend on a. For fixed power a and accuracy 
e, we have M = 0(log5 _1 ). 
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The proof of Proposition 4 in [24] is based on using the trapezoidal rule to dis- 
cretize an integral representation of l/r Q . Similar estimates may be obtained 
for more general radial kernels using their integral representations as in (14). 

We note that approximations of the function 1/r via sums of Gaussians have 
been also considered in [31,32,33]. 



8. 6 Estimates 



By selecting appropriate e and 5 in the separated representations of a radial 
kernel K (as in Proposition 4), we obtain a separated approximation for the 
coefficients 

= 2- 3j / w K(2 '(x ■ /))<I-(.r ; )<I-.(,- 2 )<I. H ,(,- ;5 ),/x. (52) 

Since the number of terms, M, depends logarithmically on e and 5, we achieve 
any finite accuracy with a very reasonable number of terms. For example, for 
the Poisson kernel K(r) = l/iirr, we have the following estimate [25], 

Proposition 5 For any e > and < 5 < 1 the coefficients t{f jj, kh i in (52) 
have an approximation with a low separation rank, 

M 

J'^ _ \^ „,, rpm,h rpm,l 2 T?m,l 3 

'ii',jj',kk' — W m r U' r jj> r kk' 1 

m=l 

such that «/maxj > 2, then 

\Ai',jj',kk' ~ r ii',jj',kk'\ — c °2 2Je ' (54) 
and i/maxj < 1, then 

\4 e , jf ,kk' ~ r ii"jf,kk'\ < 2^M 2 + c 6) (55) 

where e, 5, M , r m , w m , m = 1, . . . , M are described in Proposition 4 for a = 1 
and Co and c\ are (small) constants. 

As described in Section 4, our adaptive algorithm selects only some of the 
terms, as needed on a given scale for the desired accuracy e. 
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8.7 Evaluation of integrals with the cross- correlation functions 

We need to compute integrals in (18), where the cross-correlation functions 
are given in (45). We note that using (48), it is sufficient to compute 

for < i < 2p— 1 rather than p 2 integrals in (18). Using the relation between 
$~ and $ + in Proposition 3, we have 

[° $r ( x )e- T ™( x +Q 2 dx= f 1 $ M (-x)e- Tm{ - x+l)2 dx = (-lV ®fJx)e- Tm{x - l)2 dx, 
J-i Jo Jo 

so that 

Jo 
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